1.  Introduction

Let the random variables $Y_i$ $(i = 1, 2, \ldots, n)$ be independent and
have the same continuous distribution function $F(x)$. Let the ordered sample
be represented by $X_1 \le X_2 \le \cdots \le X_n$. From the assumption that
$F(x)$ is continuous and that the random variables are independent, it follows
that the probability of any two $X_i$'s being equal is zero.

We define the empirical distribution function $F_n(x)$ as:

$$
F_n(x) =
\begin{cases}
0 & \text{for } x < X_1, \\
\dfrac{i}{n} & \text{for } X_i \le x < X_{i+1},\ i = 1, \ldots, n-1, \\
1 & \text{for } X_n \le x.
\end{cases}
$$

It is known that the probability that the sequence $F_n(x)$ converges to
$F(x)$ as $n \to \infty$, uniformly in $x$ $(-\infty < x < +\infty)$, equals
one (see Fisz, 1963, p. 391).
Let

$$
D_n^+ = \sup_{-\infty < x < +\infty} [F(x) - F_n(x)].
$$
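As a concrete illustration of these two definitions, the following Python sketch evaluates $F_n(x)$ and $D_n^+$ for a given sample. It relies on the continuity of $F$: the supremum of $F(x) - F_n(x)$ is approached just to the left of an order statistic $X_i$, where $F_n$ equals $(i-1)/n$. The names `ecdf` and `d_plus` and the small uniform example are illustrative assumptions, not part of the report.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F_n, the empirical distribution function of the sample."""
    xs = sorted(sample)
    n = len(xs)

    def f_n(x):
        # Fraction of observations that are <= x.
        return bisect_right(xs, x) / n

    return f_n


def d_plus(sample, cdf):
    """D_n^+ = sup_x [F(x) - F_n(x)] for a continuous hypothesised cdf F."""
    xs = sorted(sample)
    n = len(xs)
    # enumerate() is 0-based, so i / n here is (i - 1)/n in 1-based notation,
    # the value of F_n just below the i-th order statistic.
    return max(cdf(x) - i / n for i, x in enumerate(xs))


if __name__ == "__main__":
    sample = [0.9, 0.2, 0.5]

    def uniform_cdf(x):
        # Rectangular distribution on [0, 1] (illustrative choice of F).
        return min(max(x, 0.0), 1.0)

    print(ecdf(sample)(0.5))
    print(d_plus(sample, uniform_cdf))
```

The maximum is taken over the $n$ order statistics only, which is sufficient because $F(x) - F_n(x)$ is nondecreasing between consecutive jumps of $F_n$.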

The distribution of $D_n^+$ was given by A. Wald and J. Wolfowitz (1939)
and by Z. W. Birnbaum and F. H. Tingey (1951). The asymptotic expression
for the distribution of $D_n^+$ was given by N. Smirnov (1939).

In the present report, the distribution of $D_n^+$ is studied in section 2.
In section 3 the power of the test based on $D_n^+$ is discussed. Discussions
in section 2 and section 3 are mainly based on Birnbaum and Tingey (1951) and
Birnbaum (1953). In section 4 the greatest lower bound for the power of the
test is obtained under a slight modification of Birnbaum's assumption given
in section 3. In section 5 numerical tables are obtained in order to make
a comparison of the power of the test with that of a parametric test under
the assumption of normality.


2.  The Distribution of $D_n^+$.

Let $F_n(x)$ be the empirical distribution function determined by a random
sample (ordered) of size $n$ from the continuous distribution function $F(x)$.
It is known that the probability

$$
P(D_n^+ \le \varepsilon), \tag{1}
$$

where $\varepsilon$ is a constant, is independent of $F(x)$ (Wald and
Wolfowitz, 1939). Hence we assume that $F(x)$ is the rectangular distribution
in the interval $[0, 1]$, namely,

$$
F(x) =
\begin{cases}
0 & \text{for } x < 0, \\
x & \text{for } 0 \le x < 1, \\
1 & \text{for } 1 \le x.
\end{cases}
$$
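Figure 1 will show that (1) is equal to the probability that the ordered
sample

$$
0 \le X_1 \le X_2 \le \cdots \le X_n \le 1
$$

satisfies the condition

$$
X_i \le \min\left(\frac{i-1}{n} + \varepsilon,\ 1\right)
\quad \text{for } i = 1, 2, \ldots, n.
$$

Figure 1.  [Graph of $F(x) = x$ against $x$; figure not reproduced.]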


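We know that the probability element of $(X_1, X_2, \ldots, X_n)$ is
$n!\,dx_1\,dx_2 \cdots dx_n$ for the region $(x_1 < x_2 < \cdots < x_n)$ and
$0 \cdot dx_1\,dx_2 \cdots dx_n$ elsewhere. Therefore we conclude

$$
P(D_n^+ \le \varepsilon)
= n! \int_0^{\varepsilon} \int_{x_1}^{\frac{1}{n}+\varepsilon} \cdots
\int_{x_k}^{\frac{k}{n}+\varepsilon} \int_{x_{k+1}}^{1} \cdots \int_{x_{n-1}}^{1}
dx_n \cdots dx_{k+2}\,dx_{k+1} \cdots dx_2\,dx_1, \tag{2}
$$

where $k$ is the greatest integer $j$ such that $\frac{j}{n} + \varepsilon < 1$.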

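The following theorem is from Birnbaum and Tingey (1951). This proof is
an expanded version of their proof.

THEOREM.  For $0 < \varepsilon \le 1$,

$$
P(D_n^+ \le \varepsilon)
= 1 - \varepsilon \sum_{j=0}^{k} \binom{n}{j}
\left(1 - \varepsilon - \frac{j}{n}\right)^{n-j}
\left(\varepsilon + \frac{j}{n}\right)^{j-1},
$$

where $k$ is the greatest integer such that $\frac{k}{n} + \varepsilon < 1$, as in (2).

Before proving the theorem, let us give the following two formulae;
namely, for any integer $k$, $1 \le k \le n$,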


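$$
\int_{x_k}^{1} \int_{x_{k+1}}^{1} \cdots \int_{x_{n-1}}^{1}
dx_n \cdots dx_{k+2}\,dx_{k+1}
= \frac{(1 - x_k)^{n-k}}{(n-k)!} \tag{3}
$$

and

$$
f(\varepsilon, k, n) =
\int_0^{\varepsilon} \int_{x_1}^{\frac{1}{n}+\varepsilon} \cdots
\int_{x_k}^{\frac{k}{n}+\varepsilon} dx_{k+1} \cdots dx_2\,dx_1
= \frac{\varepsilon}{(k+1)!} \left(\varepsilon + \frac{k+1}{n}\right)^{k}. \tag{4}
$$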

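The formula (3) is easy to show by induction. As for (4), let us assume that
it is valid for $k$, and put

$$
x_2 = x_1 + y_1, \quad x_3 = x_1 + y_2, \quad \ldots, \quad x_{k+2} = x_1 + y_{k+1}
$$

and

$$
\frac{1}{n} + \varepsilon - x_1 = \varepsilon'.
$$

Then we have, for $k+1$,

$$
f(\varepsilon, k+1, n) =
\int_0^{\varepsilon} \int_0^{\varepsilon'} \int_{y_1}^{\frac{1}{n}+\varepsilon'} \cdots
\int_{y_k}^{\frac{k}{n}+\varepsilon'} dy_{k+1} \cdots dy_2\,dy_1\,dx_1.
$$

By the assumption of induction,

$$
f(\varepsilon, k+1, n)
= \int_0^{\varepsilon} \frac{\varepsilon'}{(k+1)!}
\left(\varepsilon' + \frac{k+1}{n}\right)^{k} dx_1
= \int_0^{\varepsilon} \frac{\left(\frac{1}{n} + \varepsilon - x_1\right)}{(k+1)!}
\left(\frac{k+2}{n} + \varepsilon - x_1\right)^{k} dx_1.
$$

One can easily verify (for instance by the substitution
$u = \frac{k+2}{n} + \varepsilon - x_1$) that the last integral yields (4)
with $k+1$ instead of $k$, namely,

$$
\frac{\varepsilon}{(k+2)!} \left(\varepsilon + \frac{k+2}{n}\right)^{k+1}.
$$

Therefore (4) is valid for any integer $k$, $1 \le k \le n$.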
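Proof of the theorem: From (2) and (3) we have

$$
P(D_n^+ \le \varepsilon) = J(\varepsilon, k, n),
$$

where

$$
J(\varepsilon, k, n) = n! \int_0^{\varepsilon} \int_{x_1}^{\frac{1}{n}+\varepsilon} \cdots
\int_{x_k}^{\frac{k}{n}+\varepsilon}
\frac{(1 - x_{k+1})^{n-k-1}}{(n-k-1)!}\, dx_{k+1} \cdots dx_2\,dx_1.
$$

Then,

$$
\begin{aligned}
J(\varepsilon, k, n)
&= n! \int_0^{\varepsilon} \cdots \int_{x_{k-1}}^{\frac{k-1}{n}+\varepsilon}
\left\{ \frac{(1 - x_k)^{n-k}}{(n-k)!}
- \frac{\left(1 - \frac{k}{n} - \varepsilon\right)^{n-k}}{(n-k)!} \right\}
dx_k \cdots dx_2\,dx_1 \\
&= n! \int_0^{\varepsilon} \cdots \int_{x_{k-1}}^{\frac{k-1}{n}+\varepsilon}
\frac{(1 - x_k)^{n-k}}{(n-k)!}\, dx_k \cdots dx_2\,dx_1 \\
&\quad - \frac{n!}{(n-k)!} \left(1 - \frac{k}{n} - \varepsilon\right)^{n-k}
\int_0^{\varepsilon} \cdots \int_{x_{k-1}}^{\frac{k-1}{n}+\varepsilon}
dx_k \cdots dx_2\,dx_1.
\end{aligned}
$$

With this and from (4), we obtain

$$
J(\varepsilon, k, n) = J(\varepsilon, k-1, n)
- \varepsilon \binom{n}{k} \left(1 - \frac{k}{n} - \varepsilon\right)^{n-k}
\left(\varepsilon + \frac{k}{n}\right)^{k-1}.
$$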

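Applying this procedure successively will give us

$$
J(\varepsilon, k, n) = J(\varepsilon, 0, n)
- \varepsilon \sum_{j=1}^{k} \binom{n}{j}
\left(1 - \frac{j}{n} - \varepsilon\right)^{n-j}
\left(\varepsilon + \frac{j}{n}\right)^{j-1}.
$$

Finally noting that

$$
J(\varepsilon, 0, n) = n! \int_0^{\varepsilon} \frac{(1 - x_1)^{n-1}}{(n-1)!}\, dx_1
= 1 - (1 - \varepsilon)^n,
$$

we have

$$
P(D_n^+ \le \varepsilon) = J(\varepsilon, k, n)
= 1 - \varepsilon \sum_{j=0}^{k} \binom{n}{j}
\left(1 - \varepsilon - \frac{j}{n}\right)^{n-j}
\left(\varepsilon + \frac{j}{n}\right)^{j-1},
$$

where $(1 - \varepsilon)^n$ has been absorbed as the $j = 0$ term of the sum.
Thus the proof is complete.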
If we let

$$
D_n^- = \sup_{-\infty < x < +\infty} [F_n(x) - F(x)],
$$

then it can be shown, by the symmetry of $D_n^+$ and $D_n^-$, that

$$
P(D_n^- \le \varepsilon) = P(D_n^+ \le \varepsilon).
$$

By making use of the theorem, we can compute, for given values of $\alpha$
and $n$, the value of $\varepsilon$ for which

$$
P(D_n^+ \le \varepsilon) = 1 - \alpha, \tag{5}
$$

where $0 < \alpha < 1$.

Values of $\varepsilon$ for several values of $n$ and $\alpha$ are given in
Table 1, taken from Birnbaum and Tingey (1951).

Table 1.  The values of $\varepsilon$ for (5).

   n      $\alpha$ = .10   $\alpha$ = .05   $\alpha$ = .01   $\alpha$ = .001

   5         .4470           .5094           .6271           .7480
   8         .3583           .4096           .5065           .6130
  10         .3226           .3687           .4566           .5550
  20         .2316           .2647           .3285           .4018
  40         .1655           .1891           .2350           .2877
  50         .1484           .1696           .2107           .2581
            (.1517)         (.1731)         (.2146)         (.2628)

In practice, when $n$ is greater than 50, one can use an approximation
based on the asymptotic distribution function of $D_n^+$ due to Smirnov (1939),
namely,

$$
P(D_n^+ \le \varepsilon) \approx 1 - e^{-2n\varepsilon^2}. \tag{6}
$$

Numbers in parentheses in Table 1 are obtained from (6) for $n = 50$, and one
can see that these are fairly accurate when $n = 50$.
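The closed form of the theorem and the approximation (6) are both easy to evaluate numerically. The Python sketch below does so under stated assumptions: the function names are hypothetical, the upper summation limit is taken as $[n(1-\varepsilon)]$ (any extra term this admits has a zero factor), and the critical value in (5) is found by plain bisection, which is merely one convenient way to invert the distribution.

```python
from math import comb, floor, log, sqrt


def dplus_cdf(eps, n):
    """P(D_n^+ <= eps) from the closed form of the theorem, for 0 < eps <= 1."""
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    upper = min(floor(n * (1.0 - eps)), n - 1)
    total = 0.0
    for j in range(upper + 1):
        # Guard against a tiny negative base caused by floating-point round-off.
        base = max(1.0 - eps - j / n, 0.0)
        total += comb(n, j) * base ** (n - j) * (eps + j / n) ** (j - 1)
    return 1.0 - eps * total


def exact_critical_value(alpha, n, tol=1e-10):
    """Solve P(D_n^+ <= eps) = 1 - alpha for eps, as in (5), by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dplus_cdf(mid, n) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_critical_value(alpha, n):
    """Solve 1 - exp(-2 n eps^2) = 1 - alpha for eps, i.e. invert (6)."""
    return sqrt(-log(alpha) / (2.0 * n))


if __name__ == "__main__":
    for n in (5, 10, 50):
        print(n, exact_critical_value(0.10, n), asymptotic_critical_value(0.10, n))
```

Comparing the two columns printed for $n = 50$ gives a direct check of the parenthesised row of Table 1.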


3.  The Power of the Test based on $D_n^+$.

Let $F(x)$, the distribution function of the random variable $X$, be
continuous. We want to test the null hypothesis

$$
H_0:\ F(x) = H(x)
$$

against the alternative hypothesis

$$
H_1:\ F(x) = G(x).
$$

We use $D_n^+ = \sup_x [H(x) - F_n(x)]$ for the test statistic. For a test of
size $\alpha$ (significance level) we draw a sample of size $n$ from the
population considered, and compute from the sample the empirical distribution
function $F_n(x)$. We will reject $H_0$ at the $\alpha$ level if the inequality

$$
D_n^+ > \varepsilon
$$

is satisfied, where $\varepsilon$ is the value determined in such a way that,
provided $H_0$ is true, we have

$$
P(D_n^+ \le \varepsilon \mid H(x)) = 1 - \alpha.
$$

The value of $\varepsilon$ is given in Table 1. For $n > 50$, we use the
asymptotic distribution function of $D_n^+$ given by (6).

The power of this test is given by

$$
Q = 1 - P(D_n^+ \le \varepsilon \mid G(x)).
$$
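The definition of the power can be checked by simulation before any analytic work. The sketch below is a Monte Carlo estimate under assumptions that are not part of this section: $H$ is taken to be the standard normal distribution and $G$ a normal distribution with mean `shift` and unit variance, the function names are hypothetical, and the number of replications and the seed are arbitrary.

```python
import random
from statistics import NormalDist


def d_plus(sample, cdf):
    """D_n^+ = sup_x [H(x) - F_n(x)], attained just below an order statistic."""
    xs = sorted(sample)
    n = len(xs)
    return max(cdf(x) - i / n for i, x in enumerate(xs))


def simulated_power(n, eps, shift, reps=20000, seed=0):
    """Estimate Q = P(D_n^+ > eps | G) by sampling from G = N(shift, 1).

    Assumption for illustration: H is N(0, 1); eps is the critical value
    chosen under H (for example from Table 1).
    """
    rng = random.Random(seed)
    h_cdf = NormalDist(0.0, 1.0).cdf
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(shift, 1.0) for _ in range(n)]
        if d_plus(sample, h_cdf) > eps:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print(simulated_power(5, 0.4470, 0.0))  # should be near the size, .10
    print(simulated_power(5, 0.4470, 1.0))  # estimated power for a unit shift
```

With `shift = 0` the estimate should fall near the nominal size, which is a useful sanity check on the critical value being used.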
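One can easily verify that the inequality

$$
D_n^+ \le \varepsilon
$$

is satisfied if and only if

$$
H(X_i) \le \frac{i-1}{n} + \varepsilon \quad \text{for } i = 1, 2, \ldots, n
$$

is true (refer to Fig. 1). Hence

$$
\begin{aligned}
P(D_n^+ \le \varepsilon \mid G(x))
&= P\left(H(X_i) \le \tfrac{i-1}{n} + \varepsilon,\ i = 1, 2, \ldots, n \mid G(x)\right) \\
&= P\left(X_i \le H^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right),\ i = 1, 2, \ldots, n \mid G(x)\right) \\
&= P\left(G(X_i) \le G\left[H^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right)\right],\ i = 1, 2, \ldots, n\right),
\end{aligned} \tag{7}
$$

where $H^{-1}$ is the inverse function of $H$.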

We recall that since $G(x)$ is continuous, the new random variable
$Z = G(X)$ has the uniform distribution in the interval $[0, 1]$. Hence the
$Z_i = G(X_i)$ are the order statistics of an independent sample drawn from a
population with the uniform distribution in the interval $[0, 1]$. So we obtain

$$
1 - Q = P\left(Z_i \le G\left[H^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right)\right],\ i = 1, \ldots, n \mid U(z)\right), \tag{8}
$$

where $U(z)$ is the uniform distribution function in the interval $[0, 1]$.
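By the fact that the probability element of $(Z_1, Z_2, \ldots, Z_n)$ is
$n!\,dz_1\,dz_2 \cdots dz_n$ for $z_1 < z_2 < \cdots < z_n$
and
$0 \cdot dz_1\,dz_2 \cdots dz_n$ elsewhere,
we conclude

$$
Q = \text{Power} = 1 - n! \int_0^{R(\varepsilon)} \int_{z_1}^{R(\frac{1}{n}+\varepsilon)} \cdots
\int_{z_{n-1}}^{R(\frac{n-1}{n}+\varepsilon)} dz_n \cdots dz_2\,dz_1,
$$

where

$$
R(v) =
\begin{cases}
\lim\limits_{v \to 0^+} G[H^{-1}(v)] & \text{for } v \le 0, \\
G[H^{-1}(v)] & \text{for } 0 < v < 1, \\
\lim\limits_{v \to 1^-} G[H^{-1}(v)] & \text{for } v \ge 1.
\end{cases} \tag{9}
$$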

Birnbaum (1953) found the greatest lower bound for the power of this
test under the assumption that

$$
\sup_{-\infty < x < +\infty} [H(x) - G(x)] = \delta > 0 \tag{10}
$$

and

$$
H(x_0) - G(x_0) = \delta. \tag{11}
$$


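He established that

$$
\text{Power} > \sum_{i=0}^{j} \binom{n}{i} u_0^{\,i} (1 - u_0)^{n-i}, \tag{12}
$$

where $u_0 = G(x_0)$, $j = [n(v_0 - \varepsilon)]$ and $v_0 = H(x_0)$.

He also gave the least upper bound for the power, namely,

$$
\text{Power} < 1 \quad \text{for } \varepsilon < \delta
$$

and

$$
\text{Power} < (\varepsilon - \delta) \sum_{i=0}^{k} \binom{n}{i}
\left(1 - \varepsilon + \delta - \frac{i}{n}\right)^{n-i}
\left(\varepsilon - \delta + \frac{i}{n}\right)^{i-1}
\quad \text{for } \varepsilon \ge \delta, \tag{13}
$$

where $k = [n(1 - \varepsilon + \delta)]$.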
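Both of Birnbaum's bounds are finite sums and can be evaluated directly. The following Python sketch does this; the function names and the example arguments are hypothetical, the square brackets in (12) and (13) are read as the integer part, and the value 1 is returned at $\varepsilon = \delta$, where the $i = 0$ term of (13) is understood as its limit.

```python
from math import comb, floor


def birnbaum_lower_bound(n, eps, u0, v0):
    """Right-hand side of (12), with u0 = G(x_0) and v0 = H(x_0)."""
    j = floor(n * (v0 - eps))
    # An empty sum (j < 0) gives the trivial bound 0.
    return sum(comb(n, i) * u0 ** i * (1.0 - u0) ** (n - i) for i in range(j + 1))


def birnbaum_upper_bound(n, eps, delta):
    """Right-hand side of (13) for eps > delta, and 1 for eps <= delta."""
    t = eps - delta
    if t <= 0.0:
        return 1.0
    k = min(floor(n * (1.0 - t)), n - 1)
    total = 0.0
    for i in range(k + 1):
        # Guard against a tiny negative base caused by floating-point round-off.
        base = max(1.0 - t - i / n, 0.0)
        total += comb(n, i) * base ** (n - i) * (t + i / n) ** (i - 1)
    return t * total


if __name__ == "__main__":
    # Illustrative inputs only: n = 5, eps = .4470, G(x_0) = .3, H(x_0) = .8.
    print(birnbaum_lower_bound(5, 0.4470, 0.3, 0.8))
    print(birnbaum_upper_bound(5, 0.4470, 0.2))
```

Note that (13) is the complement of the theorem of section 2 evaluated at $\varepsilon - \delta$, which is why the two sums have the same shape.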

In fact, the right-hand side of (12) is the power when $G(x)$ is such that

$$
G(x) = G^{*}(x) =
\begin{cases}
H(x_0) - \delta & \text{for } x \le x_0, \\
1 & \text{for } x > x_0,
\end{cases}
$$

and the right-hand side of (13) is the power when $G(x)$ is such that

$$
G(x) = G^{**}(x) = \max\,[H(x) - \delta,\ 0].
$$

Figure 2.  [Graphs of $G^{*}(x)$ and $G^{**}(x)$; figure not reproduced.]

In practice, there exists neither $G^{*}(x)$ nor $G^{**}(x)$ as a distribution
function. However, we can construct a $G(x)$ arbitrarily close to $G^{*}(x)$ or
$G^{**}(x)$.


4.  Lower Bound for the Power under Certain Assumptions.

In this section we make a slight modification of the assumptions (10)
and (11), and we will find the greatest lower bound for the power of the test.
Occasionally it is plausible to make this modified assumption, and in such
cases the lower bound given by (12) may sometimes be sharpened.

Let us assume that

$$
H(x) \ge G(x) \quad \text{for all } x \tag{14}
$$

and

$$
H(x_0) - G(x_0) = d. \tag{15}
$$


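Under this assumption we can find the greatest lower bound for the power of
the aforementioned test. To see this let

$$
\delta(x) = H(x) - G(x).
$$

Then it can be seen that $d = \delta(x_0)$ and

$$
G\left[H^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right)\right]
= \tfrac{i-1}{n} + \varepsilon
- \delta\left[H^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right)\right]
\quad \text{(see Fig. 3)}.
$$

Figure 3.  [Graph of $H(x)$ marking $\delta[H^{-1}(\frac{i-1}{n}+\varepsilon)]$ and
$G[H^{-1}(\frac{i-1}{n}+\varepsilon)]$ at $x = H^{-1}(\frac{i-1}{n}+\varepsilon)$;
figure not reproduced.]

From (7) and (8) we have

$$
P(D_n^+ \le \varepsilon \mid G(x))
= P\left(Z_i \le \tfrac{i-1}{n} + \varepsilon
- \delta\left[H^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right)\right],\ i = 1, \ldots, n \mid U(z)\right). \tag{16}
$$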

The probability (16) will give its maximum value when $G(x)$ is such that
the values of $\delta[H^{-1}(\frac{i-1}{n} + \varepsilon)]$ are as small as
possible for all $i$ under consideration. This would occur when $\delta(x)$ is
very close to $\delta_0(x)$ defined by

$$
\delta_0(x) =
\begin{cases}
H(x) - G(x_0) & \text{for } H^{-1}(G(x_0)) < x \le x_0, \\
0 & \text{elsewhere.}
\end{cases}
$$

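In fact, for any $G(x)$ under the given assumption we have

$$
P(D_n^+ \le \varepsilon \mid G(x))
\le P\left(Z_i \le \tfrac{i-1}{n} + \varepsilon - c_i,\ i = 1, 2, \ldots, n \mid U(z)\right), \tag{17}
$$

where

$$
c_i =
\begin{cases}
0 & \text{for } i \le k, \\
\dfrac{i-1}{n} + \varepsilon - G(x_0) & \text{for } k < i \le \ell, \\
0 & \text{for } \ell < i \le n,
\end{cases}
$$

$k = [n(G(x_0) - \varepsilon) + 1]$, and $\ell = [n(H(x_0) - \varepsilon) + 1]$
(see Fig. 4).

Figure 4.  [Figure not reproduced.]

Hence we have

$$
\begin{aligned}
P(D_n^+ \le \varepsilon \mid G(x)) \le n!
&\int_0^{\varepsilon} \int_{z_1}^{\frac{1}{n}+\varepsilon} \cdots \int_{z_{k-1}}^{\frac{k-1}{n}+\varepsilon}
\int_{z_k}^{b} \cdots \int_{z_{\ell-1}}^{b}
\int_{z_\ell}^{\frac{\ell}{n}+\varepsilon} \cdots \int_{z_{m-1}}^{\frac{m-1}{n}+\varepsilon}
\int_{z_m}^{1} \cdots \int_{z_{n-1}}^{1} \\
&\quad dz_n \cdots dz_{m+1}\,dz_m \cdots dz_{\ell+1}\,dz_\ell \cdots dz_{k+1}\,dz_k \cdots dz_2\,dz_1,
\end{aligned} \tag{18}
$$

where $b = G(x_0)$ and $m = [n(1 - \varepsilon) + 1]$.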

Since the power of the test is the complementary probability of
$P(D_n^+ \le \varepsilon \mid G(x))$, we obtain:

THEOREM.  Under the assumption of $H(x) \ge G(x)$, for all $x$, with
$H(x_0) - G(x_0) = d$, the greatest lower bound for the power of the $D_n^+$
test is

$$
1 - P_0, \tag{19}
$$

where $P_0$ is given in the right-hand side of (18), and also by (20).
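The bound $1 - P_0$ can be evaluated numerically without the closed form, because the iterated integral in (18) has a polynomial integrand at every stage. The Python sketch below integrates from the innermost variable outward, keeping the polynomial coefficients explicitly. Everything about it is an illustrative assumption: the function names, the use of the conditions $G(x_0) < \frac{i-1}{n} + \varepsilon \le H(x_0)$ in place of the indices $k$ and $\ell$, and the example inputs. It is meant for small $n$; the alternating coefficients may lose accuracy when $n$ is large.

```python
from math import factorial


def ordered_uniform_prob(bounds):
    """P(Z_i <= bounds[i-1], i = 1..n) for uniform order statistics Z_1 <= ... <= Z_n.

    Requires nondecreasing bounds in [0, 1]. Computes n! times the iterated
    integral by repeated polynomial integration, innermost variable first.
    """
    poly = [1.0]  # current integrand, coefficients in ascending powers
    for b in reversed(bounds):
        # Antiderivative A(t) of the current integrand, with A(0) = 0.
        anti = [0.0] + [c / (p + 1) for p, c in enumerate(poly)]
        value_at_b = sum(c * b ** p for p, c in enumerate(anti))
        # Integral from z to b equals A(b) - A(z), a polynomial in z.
        poly = [-c for c in anti]
        poly[0] += value_at_b
    # The outermost lower limit is 0, so only the constant term remains.
    return factorial(len(bounds)) * poly[0]


def modified_lower_bound(n, eps, g0, h0):
    """1 - P_0 as in (19), with g0 = G(x_0) and h0 = H(x_0), h0 >= g0."""
    bounds = []
    for i in range(1, n + 1):
        v = (i - 1) / n + eps
        if g0 < v <= h0:
            v = g0  # here c_i = v - G(x_0), so the bound v - c_i is G(x_0)
        bounds.append(min(v, 1.0))
    return 1.0 - ordered_uniform_prob(bounds)


if __name__ == "__main__":
    # Illustrative inputs only: n = 5, eps = .4470, G(x_0) = .3, H(x_0) = .8.
    print(modified_lower_bound(5, 0.4470, 0.3, 0.8))
```

When `g0 == h0` no bound is altered, and the function returns the size of the test, in line with the theorem of section 2.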
One can see that, when $d$ in (15) equals $\delta$ in (10), the greatest
lower bound given by (19) is greater than or equal to that of Birnbaum
given by (12), because $G^{*}(x)$ in Fig. 2 is greater than or equal to the
corresponding curve, $\min(G^{*}(x), H(x))$, as indicated in Fig. 4. This fact
is not contrary to our intuition because the assumption on $G(x)$ given
by (14) is more restrictive than the assumption on $G(x)$ given by Birnbaum.

The integration of $P_0$ is tedious, but straightforward, and one may
verify the following results, step by step.


^1- 


n-m 

vx 


,      ,  v.n-m)  1 

m     n-1 


"£      Vl 


(1-z^)       (-+e-zpm-l  n-j        j-£-l 

n ^   [(H^jd^..^)    (1+e-z)     ], 

(n-Ji)l      (n-i)!    j=£   ^-j    n       n     / 

Now  let  ?„,  and  P-_  be  the  first  and  second  term  of  the  right-hand  side 

of  the  last  expression  respectively,  then  we  get 

z'  b    -  b  ,T    .n-k 

P-,  "    I        •••  /     P„,dz. ..,dz,  ,T  = 

''  J,  i  ''  '     ^-"^    (n-k)l 


^  Y"'  (":^)  (i-b)^-^-J  (b-z,)^ 


(n-k)I  i-0  ^  ^ 
and


1    rt(l-f-er^b>z,)^-^ 


(n-k)I  'n-l'^"   n  "'   "   k' 


|i!-k-i 
i=l   (n-k)!     ""*'"  j=£+l 


,'f±:!^L_,-^,t    [Oa-i-.)-^ 


*^+e-b)-^       (e-b  +  — -)  J 


<n 


m-l 


(n-k)!  ^i^^^  Vj^^"-   n       'n     k^     ^   k  n 
Again  letting  P-,,  ^^  and  P32    be  the  kth  term  of  the  right-hand  sides  of 
P,  and  P„„  respectively,  we  can  express  P   as: 

P  =ni  fV""  '      [  "^    \v  ^^K?  ^^K?  <^^+p   ^^K?  ^^h 

P^  =  n!     /       .../        (P3^   +^33^   +1'32   +^-32   +^32   ^ 
Jo    Jz^  Jz^_^ 


dz^  . . .dz^dz.  . 
k     2  1 

By  performing  integrations  term  by  term,  we  obtain  the  final  result 




k-1  .   j-1  n-j 

(::0)     P     =  1  -  e  ^      (^(e  +^)        (1  -^-  e) 


m-1 


-e     I       (")(l-:^-e)-^(^+e) 


J-1 


j-i+l 


k  ,_L-  .  4  1     •   £-k+i  ,     .   k-l-i 

-  =     I   <.-!>  O)  (^  -  ^  ->"     <^-=-  ¥>  <=  ^  ^> 

i=l 


n 


-  0(l-^-e)"-^b 


Ji-k-1 


-     Z        (.l)(l-b) 


n-k-j  ,  j+k 


3=0 


j+k' 


i=0  j=0         1     ^+3   -L  » 


k   4-k  m-1 


(i-k-i+r 


V     r       T  r /  n  \  /H-k+rs  ,,  k-r., -  -  "• '  k-r^ 

-  e     III  [(j^_^)  (^_^^.)  (b-e-  — )  (e  +  — ) 

r=l  x=l  j=jl+l 


k-l-r 


•  K(i.  j)] 


£-k     m+1 


n     s   v^-i  . 


1=1     3=^+1 
where     K(i,j)   -  Cl[f)a  -  ^  -  e)^"^    (1  +  e  -  b)  (e  -  b  +  ^)    . 

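If $k = \ell$, $c_i = 0$ in (17), and we have a simple form for $P_0$, resulting
in

$$
P_0 = 1 - \varepsilon \sum_{j=0}^{m-1} \binom{n}{j}
\left(1 - \frac{j}{n} - \varepsilon\right)^{n-j}
\left(\varepsilon + \frac{j}{n}\right)^{j-1},
$$

in agreement with the theorem in section 2.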
5.  Some  Numerical  Results  for  the  Power. 

For the purpose of making a comparison between the power of the test
based on $D_n^+$ and that of a parametric test, let us consider the following
hypotheses:

$$
H_0:\ H(x) = N_{\mu,\sigma}(x),
$$

$$
H_1:\ G(x) = N_{\mu_1,\sigma}(x),
$$

where $\mu_1 = \mu + K$, $K > 0$, and $N_{\mu,\sigma}(x)$ denotes the normal
distribution function with mean $\mu$ and standard deviation $\sigma$.

We draw a sample of size $n$ from the population considered. Let $X_1$,
$X_2, \ldots, X_n$ be the ordered sample and $F_n(x)$ be the empirical
distribution function determined by the sample. For a test of size $\alpha$,
choose a corresponding value of $\varepsilon$ from Table 1.
From (8), we have

$$
1 - Q = P\left(Z_i \le N_{\mu+K,\sigma}\left[N_{\mu,\sigma}^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right)\right],\ i = 1, \ldots, n \mid U(z)\right). \tag{21}
$$


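Let

$$
N_{\mu,\sigma}^{-1}(y) = x_0.
$$

Then we can write

$$
y = \int_{-\infty}^{x_0} n(t;\ \mu, \sigma^2)\,dt
= \int_{-\infty}^{\frac{x_0 - \mu}{\sigma}} n(t;\ 0, 1)\,dt,
$$

where $n(t;\ \mu, \sigma^2)$ is the normal density function with mean $\mu$ and
variance $\sigma^2$. Hence we have

$$
N_{0,1}^{-1}(y) = \frac{x_0 - \mu}{\sigma},
$$

which is written as

$$
N_{\mu,\sigma}^{-1}(y) = \sigma N_{0,1}^{-1}(y) + \mu. \tag{22}
$$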


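From (22) and with the fact that
$N_{\mu,\sigma}(x) = N_{0,1}\left(\frac{x - \mu}{\sigma}\right)$,
we have

$$
N_{\mu+K,\sigma}\left[N_{\mu,\sigma}^{-1}(y)\right]
= N_{\mu+K,\sigma}\left[\sigma N_{0,1}^{-1}(y) + \mu\right]
= N_{0,1}\left[N_{0,1}^{-1}(y) - \frac{K}{\sigma}\right]. \tag{23}
$$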


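From (21) and (23), we can write

$$
1 - Q = P\left(Z_i \le N_{0,1}\left[N_{0,1}^{-1}\left(\tfrac{i-1}{n} + \varepsilon\right) - \tfrac{K}{\sigma}\right],\ i = 1, \ldots, n \mid U(z)\right),
$$

where $N_{0,1}^{-1}(\frac{i-1}{n} + \varepsilon)$ is defined similarly to (9), that
is, by its limiting value when the argument falls outside the interval $(0, 1)$.
Therefore we have the power function

$$
Q = 1 - n! \int_0^{N_1} \int_{z_1}^{N_2} \cdots \int_{z_{n-1}}^{N_n} dz_n \cdots dz_2\,dz_1, \tag{24}
$$

where $N_i = N_{0,1}\left[N_{0,1}^{-1}\left(\frac{i-1}{n} + \varepsilon\right) - \frac{K}{\sigma}\right]$,
$i = 1, 2, \ldots, n$.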

We notice that under the given hypotheses, the power function is
independent of the actual values of $\mu$ and $\sigma$, but it depends on the
value of $K/\sigma$.

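To illustrate the use of (24), let

$$
n = 5, \quad \alpha = .10 \quad \text{and} \quad \frac{K}{\sigma} = 1.
$$

From Table 1, $\varepsilon = .4470$, and then we obtain

$$
\begin{aligned}
N_1 &= N_{0,1}\left[N_{0,1}^{-1}(.4470) - 1\right] = N_{0,1}[-1.133] = .1286, \\
N_2 &= N_{0,1}\left[N_{0,1}^{-1}(.6470) - 1\right] = N_{0,1}[-.623] = .2667, \\
N_3 &= N_{0,1}\left[N_{0,1}^{-1}(.8470) - 1\right] = N_{0,1}[.024] = .5096, \\
N_4 &= N_{0,1}\left[N_{0,1}^{-1}(1.0470) - 1\right] = N_{0,1}[+\infty] = 1, \\
N_5 &= N_{0,1}\left[N_{0,1}^{-1}(1.2470) - 1\right] = N_{0,1}[+\infty] = 1.
\end{aligned}
$$

By substituting these values for the $N_i$'s in (24), and after a little
calculation, we have the power

$$
Q = .7495.
$$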
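The "little calculation" can be reproduced mechanically. The following Python sketch evaluates (24) for a normal location shift by integrating the polynomial integrand from the innermost variable outward. The function names are hypothetical, the limits 0 and 1 are used for arguments outside $(0, 1)$ in the spirit of (9), and the result may differ from the reported value in the last digit or two because the $N_i$ are not rounded here.

```python
from math import factorial
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def shifted_bound(v, k_over_sigma):
    """N_i = N_{0,1}[N_{0,1}^{-1}(v) - K/sigma], with limiting values outside (0, 1)."""
    if v <= 0.0:
        return 0.0
    if v >= 1.0:
        return 1.0
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(v) - k_over_sigma)


def ordered_uniform_prob(bounds):
    """n! times the iterated integral in (24); bounds must be nondecreasing in [0, 1]."""
    poly = [1.0]  # current integrand, coefficients in ascending powers
    for b in reversed(bounds):
        # Antiderivative A(t) with A(0) = 0; the integral from z to b is A(b) - A(z).
        anti = [0.0] + [c / (p + 1) for p, c in enumerate(poly)]
        value_at_b = sum(c * b ** p for p, c in enumerate(anti))
        poly = [-c for c in anti]
        poly[0] += value_at_b
    return factorial(len(bounds)) * poly[0]


def exact_power(n, eps, k_over_sigma):
    """Power Q of (24) for the normal shift alternative with ratio K/sigma."""
    bounds = [shifted_bound(i / n + eps, k_over_sigma) for i in range(n)]
    return 1.0 - ordered_uniform_prob(bounds)


if __name__ == "__main__":
    print(exact_power(5, 0.4470, 1.0))  # compare with Q = .7495 above
    print(exact_power(5, 0.4470, 0.0))  # no shift: should be close to alpha = .10
```

Setting `k_over_sigma` to zero recovers the size of the test, which checks the integration routine against the theorem of section 2.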
Table 2 is the result of several such calculations, and gives the values
of the power for $\alpha$ = .10, .05 and .01 when $n = 5$, with
$K/\sigma$ = .5, 1.0, 1.5 and 2.0.

Table 2.  Values of the power using $D_n^+$.

$K/\sigma$   $\alpha$ = .10   $\alpha$ = .05   $\alpha$ = .01

  .5         31.65%          25.09%           8.56%
 1.0         74.95           61.57           33.16
 1.5         94.93           89.50           68.19
 2.0         99.50           97.89           90.53

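The most powerful parametric test under the same hypotheses would be
the following: reject $H_0$ when the sample mean $\bar{X}_n$ exceeds a
constant $c$.
By noting that

$$
\alpha = P\left(\bar{X}_n > c \mid N(x;\ \mu, \sigma^2)\right)
= P\left(Z > \frac{c - \mu}{\sigma/\sqrt{n}} \mid N(z;\ 0, 1)\right), \tag{25}
$$

we have, for the power,

$$
Q = P\left(\bar{X}_n > c \mid N(x;\ \mu + K, \sigma^2)\right)
= P\left(Z > \frac{c - \mu - K}{\sigma/\sqrt{n}} \mid N(z;\ 0, 1)\right). \tag{26}
$$

If we let $\alpha = .10$, $n = 5$ and $K/\sigma = 1$, then from (25) and (26)
we have

$$
\frac{c - \mu}{\sigma/\sqrt{n}} = 1.282,
$$

$$
\frac{c - \mu - K}{\sigma/\sqrt{n}} = 1.282 - \sqrt{5} = -.954,
$$

and therefore

$$
Q = .8299.
$$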
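The two steps (25) and (26) combine into a single expression, $Q = 1 - N_{0,1}\left(z_{1-\alpha} - \sqrt{n}\,K/\sigma\right)$, where $z_{1-\alpha}$ is the upper $\alpha$ point of the standard normal distribution. The short Python sketch below evaluates it; the function name is hypothetical and the standard library's `NormalDist` is used for the normal distribution function and its inverse.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(alpha, n, k_over_sigma):
    """Power of the one-sided test on the sample mean, from (25) and (26)."""
    std_normal = NormalDist(0.0, 1.0)
    # (25): the standardised critical point under H_0.
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # (26): under the alternative the standardised point shifts by sqrt(n) * K / sigma.
    return 1.0 - std_normal.cdf(z_crit - sqrt(n) * k_over_sigma)


if __name__ == "__main__":
    for ratio in (0.5, 1.0, 1.5, 2.0):
        print(ratio, [round(100 * z_test_power(a, 5, ratio), 2) for a in (0.10, 0.05, 0.01)])
```

The printed rows can be compared with the tabulated powers of the parametric test for $n = 5$.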
The following Table 3 gives the values of the power of the parametric
test for the same values of $\alpha$, $n$ and $K/\sigma$ as in Table 2.

Table 3.  Values of the power using the most powerful test.

$K/\sigma$   $\alpha$ = .10   $\alpha$ = .05   $\alpha$ = .01

  .5         43.48%          29.91%          11.35%
 1.0         82.99           72.27           46.41
 1.5         98.09           95.58           84.80
 2.0         99.93           99.77           98.40

By comparing Table 2 with Table 3, we see that the one-sided $D_n^+$ test
turns out to be, under the hypotheses of normal distributions with equal
variances, less powerful than the one-sided classical test. A similar result
was obtained by van der Waerden (1953), when $H(x)$ is normal with mean 0,
variance 1, and $G(x)$ is normal with mean $\mu > 0$, variance 1, for
$n = 2, 3, 5$ and for $\alpha = .01$.

However, it should be noted that the comparison is not quite fair. The
Kolmogorov-Smirnov test may be used when the actual functional form of the
distribution is not known, whereas the classical parametric test is used
when the functional form is known and only a parameter is unknown. As van
der Waerden noted, if, for instance, the true distribution is normal with
mean 0 and variance much smaller than 1, the Kolmogorov-Smirnov test may
enable us to reject the hypothesis that the variance equals 1, whereas the
classical test used in this section is quite useless for this purpose.

When the functional forms of $H(x)$ and $G(x)$ are known one can construct
a more powerful test of two simple hypotheses than that based on $D_n^+$. If
the hypotheses are composite it may not be the case. The usefulness of the
test based on $D_n^+$ is that, with a small loss of power, we have our test for
all continuous distributions. The test is distribution free.
ABSTRACT

The power of the one-sided and one-sample Kolmogorov-Smirnov test is
studied in this report.

Let $F_n(x)$ be the empirical distribution function determined by an ordered
sample of size $n$ drawn from a population with a continuous distribution
function $F(x)$ which is unknown. Then, for an alternative $G(x)$, the power
is defined as

$$
1 - \Pr\left[D_n^+ \le \varepsilon(\alpha, n) \mid G(x)\right],
$$

where $D_n^+ = \sup_{-\infty < x < +\infty} [F(x) - F_n(x)]$, and
$\varepsilon(\alpha, n)$ is some constant which depends on $\alpha$ (level of
significance) and $n$.

The main difficulty of studies on the power of the test (in general,
for all non-parametric tests) is how to select the alternative hypothesis
from among all possible alternative hypotheses.

Birnbaum (1953) gave the greatest lower bound and the least upper bound
for the power of the test under the assumption that

$$
\sup_{-\infty < x < \infty} [F(x) - G(x)] = \delta
$$

and

$$
F(x_0) - G(x_0) = \delta.
$$

Under a slight modification of the above assumption, the greatest lower
bound for the power of the test is found.

The power of the test is compared with the power of a parametric test
under the assumption of normal distributions with equal variances for
$\alpha$ = .10, .05, and .01 when $n = 5$. The result of the comparison is that
the Kolmogorov-Smirnov test is less powerful than the parametric test
considered. Needless to say, a non-parametric test is a tool which may be used
when the functional form of the hypothesis tested is not known.


